Abstract

It has previously been an open problem whether all Boolean submodular functions can be decomposed 
into a sum of binary submodular functions over a possibly larger set of variables. This problem has been 
considered within several different contexts in computer science, including computer vision, artificial 
intelligence, and pseudo-Boolean optimisation. Using a connection between the expressive power of 
valued constraints and certain algebraic properties of functions, we answer this question negatively. 

Our results have several corollaries. First, we characterise precisely which submodular functions of 
arity 4 can be expressed by binary submodular functions. Next, we identify a novel class of submodular 
functions of arbitrary arities which can be expressed by binary submodular functions, and therefore min- 
imised efficiently using a so-called expressibility reduction to the Min-Cut problem. More importantly, 
our results imply limitations on this kind of reduction and establish for the first time that it cannot be 
used in general to minimise arbitrary submodular functions. Finally, we refute a conjecture of Promislow 
and Young on the structure of the extreme rays of the cone of Boolean submodular functions. 

Keywords: Combinatorial optimisation, decomposition of submodular functions, expressive power, Gibbs 
energy minimisation, Markov Random Fields, min cut, multimorphisms, submodular pseudo-Boolean min- 
imisation, submodular polynomials, valued constraint satisfaction problems. 

1 Introduction 

1.1 Background 

A function $f : 2^V \to \mathbb{R}$ is called submodular if for all $S, T \subseteq V$,

$$f(S \cap T) + f(S \cup T) \le f(S) + f(T).$$

Submodular functions are a key concept in operational research and combinatorial optimisation [39, 38, 48, 
47, 17, 33, 27]. Examples include cut capacity functions, matroid rank functions, and entropy functions. 
Submodular functions are often considered as a discrete analogue of convex functions [36]. 
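
The defining inequality can be checked by brute force on a small ground set. The following is a minimal sketch, not part of the original article: the helper names and the example graph `EDGES` are assumptions chosen for illustration. It confirms that a cut function satisfies the inequality, while $|S|^2$ does not.

```python
from itertools import combinations


def subsets(ground):
    """Yield every subset of `ground` as a frozenset."""
    items = list(ground)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def is_submodular(f, ground):
    """Check f(S & T) + f(S | T) <= f(S) + f(T) for all subsets S, T."""
    all_sets = list(subsets(ground))
    return all(f(s & t) + f(s | t) <= f(s) + f(t)
               for s in all_sets for t in all_sets)


# Hypothetical example graph on vertices 0..3 (not from the article).
EDGES = [(0, 1), (1, 2), (2, 3), (0, 3), (0, 2)]


def cut(s):
    """Cut capacity: number of edges with exactly one endpoint in s."""
    return sum((a in s) != (b in s) for a, b in EDGES)


def cardinality_squared(s):
    """|S|^2 is supermodular, so it should fail the submodularity check."""
    return len(s) ** 2


print(is_submodular(cut, range(4)))
print(is_submodular(cardinality_squared, range(4)))
```

The check is exponential in the size of the ground set, so it is only meant as an illustration of the definition.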

Both minimising and maximising submodular functions, possibly under some additional conditions, have 
been considered extensively in the literature. Submodular function maximisation is easily shown to be NP- 
hard [ ] since it generalises many standard NP-hard problems such as the maximum cut problem. In contrast, 

* An earlier version of some parts of the results of this article appeared in Proceedings of the 14th International Conference on
Principles and Practice of Constraint Programming (CP), 2008, pp. 112-127, and in Oxford University Computing Laboratory
Technical Report CS-RR-08-08, June 2008.



the problem of minimising a submodular function (SFM) can be solved efficiently with only polynomially 
many oracle calls, either by using the ellipsoid algorithm [20, 21], or by using one of several combinatorial 
algorithms that have been obtained in the last decade [46, 26, 24, 25, 40, 28]. The time complexity of the 
fastest known general algorithm for SFM is $O(n^6 + n^5 L)$, where $n$ is the number of variables and $L$ is the
time required to evaluate the function [40]. 

The minimisation of submodular functions on sets is equivalent to the minimisation of submodular func- 
tions on distributive lattices [ ] . Krokhin and Larose have also studied the more general problem of min- 
imising submodular functions on non-distributive lattices [34]. 

An important and well-studied sub-problem of SFM is the minimisation of submodular functions of 
bounded arity (SFM$_b$), also known as locally defined submodular functions. In this scenario the submodular
function to be minimised is defined as the sum of a collection of functions which each depend only on a 
bounded number of variables. Locally defined optimisation problems occur in a variety of contexts: 

• In the context of pseudo-Boolean optimisation, such problems involve the minimisation of Boolean 
polynomials of bounded degree [4] . 

• In the context of artificial intelligence, they have been studied as valued constraint satisfaction problems 
(VCSP) [44], also known as soft or weighted constraint satisfaction problems. 

• In the context of computer vision, such problems are often formulated as Gibbs energy minimisation 
problems [19] or Markov Random Fields [35, 49]. 

We will present our results primarily in the language of pseudo-Boolean optimisation. Hence an instance of 
SFM$_b$ with $n$ variables will be represented as a polynomial in $n$ Boolean variables, of some fixed bounded
degree. However, we will also mention the consequences of our results for constraint satisfaction problems 
and certain optimisation problems arising in computer vision. 

A general algorithm for SFM can always be used for the more restricted SFM$_b$, but the special features of
this more restricted problem sometimes allow more efficient special-purpose algorithms to be used. (Note that 
we are focusing on exact algorithms which find an optimum solution.) In particular, it has been shown that 
certain cases can be solved much more efficiently by reducing to the Min-Cut problem, that is, the problem 
of finding a minimum cut in a directed graph which includes a given source vertex and excludes a given 
target vertex. For example, it has been known since 1965 that the minimisation of quadratic submodular 
polynomials is equivalent to finding a minimum cut in a corresponding directed graph [23, 4]. Hence quadratic 
submodular polynomials can be minimised in $O(n^3)$ time, where $n$ is the number of variables.
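
The reduction from quadratic submodular polynomials to Min-Cut can be sketched as follows. This is an illustrative construction under stated assumptions, not the specific construction of [23, 4]: a variable is taken to be 1 when its vertex lies on the source side of the cut, each term $-w x_i x_j$ (with $w \ge 0$) is rewritten as $-w x_i + w x_i (1 - x_j)$, and a plain Edmonds-Karp max-flow routine stands in for any Min-Cut solver. The function names and the example polynomial are hypothetical.

```python
from collections import deque
from itertools import product


def max_flow(cap, s, t):
    """Edmonds-Karp on a dense capacity matrix (used in place as residual)."""
    n = len(cap)
    flow = 0
    while True:
        parent = [-1] * n
        parent[s] = s
        queue = deque([s])
        while queue and parent[t] == -1:
            u = queue.popleft()
            for v in range(n):
                if parent[v] == -1 and cap[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if parent[t] == -1:
            return flow  # no augmenting path left
        bottleneck = float("inf")
        v = t
        while v != s:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:
            cap[parent[v]][v] -= bottleneck
            cap[v][parent[v]] += bottleneck
            v = parent[v]
        flow += bottleneck


def minimise_quadratic(const, linear, quad):
    """Minimise const + sum a_i x_i + sum b_ij x_i x_j with all b_ij <= 0.

    `linear` is the list of a_i; `quad` maps pairs (i, j), i < j, to b_ij.
    """
    n = len(linear)
    s, t = n, n + 1
    cap = [[0] * (n + 2) for _ in range(n + 2)]
    a = list(linear)
    for (i, j), b in quad.items():
        assert b <= 0, "quadratic coefficients must be non-positive"
        a[i] += b             # -w x_i x_j = -w x_i + w x_i (1 - x_j)
        cap[i][j] += -b       # edge i -> j is cut when x_i = 1, x_j = 0
    for i, ai in enumerate(a):
        if ai >= 0:
            cap[i][t] += ai   # paid when x_i = 1
        else:
            const += ai       # a x_i = a + |a| (1 - x_i)
            cap[s][i] += -ai  # paid when x_i = 0
    return const + max_flow(cap, s, t)


def brute_force(const, linear, quad):
    """Reference minimum by enumerating all Boolean assignments."""
    n = len(linear)
    return min(const + sum(a * x[i] for i, a in enumerate(linear))
               + sum(b * x[i] * x[j] for (i, j), b in quad.items())
               for x in product((0, 1), repeat=n))


# Hypothetical instance: 2 + 3x0 - x1 + 2x2 - 4x0x1 - 2x1x2.
print(minimise_quadratic(2, [3, -1, 2], {(0, 1): -4, (1, 2): -2}))
```

Integer coefficients are used so that the max-flow loop terminates with exact values; the dense-matrix flow routine is chosen for brevity rather than speed.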

A similar approach, using a reduction to Min-Cut, can be used for any class of polynomials which 
can be decomposed into a sum of quadratic submodular polynomials, perhaps with additional variables to 
be minimised over. We will say that a polynomial that can be decomposed in this way is expressible by 
quadratic submodular polynomials (see Section 2.1). The following classes of functions have all been shown
to be expressible in this way, over the past four decades: 

• polynomials where all terms of degree 2 or more have negative coefficients (also known as negative- 
positive polynomials) [43]; 

• cubic submodular polynomials [ ]; 

• {0, 1}-valued submodular functions (also known as 2-monotone functions) [13, 10];

• binary submodular functions over non-Boolean domains [ ] (also known as Monge matrices [6]); 

• generalised 2-monotone functions [10]; 

• a class recently found by Živný and Jeavons [52] and independently by Zalesky [51].

All these classes of functions have been shown to be expressible by quadratic submodular polynomials and 
hence can be minimised in cubic time. 

This series of positive expressibility results naturally raises the following question: 
Problem 1. Are all submodular polynomials expressible by quadratic submodular polynomials?

Each of the above expressibility results was obtained by an ad-hoc construction, and no general technique 
has previously been proposed which is sufficiently powerful to address Problem 1. 

1.2 Contributions 

Cohen et al. recently developed a novel algebraic approach to characterising the expressive power of valued 
constraints in terms of certain algebraic properties of those constraints [7]. 

Using this systematic algebraic approach we are able to give a negative answer to Problem 1: we show that 
there are quartic submodular polynomials which are not expressible by quadratic submodular polynomials. 
More precisely, we characterise exactly which quartic submodular polynomials are expressible by quadratic 
submodular polynomials and which are not. In addition, we show that any quartic submodular polynomial 
is either expressible by quadratic submodular polynomials with only linearly many extra variables, or it is 
not expressible at all. 

On the way to establishing this result we show that two broad families of submodular functions known as 
upper fans and lower fans are all expressible by binary submodular functions. This provides a new class of 
submodular polynomials of all arities which are expressible by quadratic submodular polynomials and hence 
solvable efficiently by reduction to Min-Cut. We use the expressibility of this family, and the existence of 
non-expressible functions, to refute a conjecture from [41] on the structure of the extreme rays of the cone of 
Boolean submodular functions, and suggest a more refined conjecture of our own. 

1.3 Applications 

The concept of submodularity is important in a wide variety of fields within computer science; in this paper 
we briefly discuss two of these: artificial intelligence and computer vision. Our results can be directly applied 
to both of these areas, as we show in Section 3.4 below. 

Artificial Intelligence A major area of investigation in artificial intelligence is the Constraint Satisfaction 
problem (CSP) [44]. A number of extensions have been added to the basic CSP framework to deal with 
questions of optimisation, including semi-ring CSPs, valued CSPs, soft CSPs and weighted CSPs. These 
extended frameworks can be used to model a wide range of discrete optimisation problems [45, 3, 44], 
including standard problems such as Min-Cut, Max-Sat, Max-Ones Sat, Max-CSP [13, 11], and Min-Cost
Homomorphism [22].

The differences between the various frameworks are not relevant for our purposes, so we will simply 
focus on one very general framework, the valued constraint satisfaction problem or VCSP. Informally, in 
the VCSP framework, an instance consists of a set of variables, a set of possible values for those variables, 
and a set of constraints. Each constraint has an associated cost function which assigns a cost (or degree of 
violation) to every possible tuple of values for the variables in the scope of the constraint. The goal is to find 
an assignment of values to all of the variables which has the minimum total cost. 

The class of constraints with submodular cost functions is the only non-trivial tractable class of optimisation
problems in the dichotomy classification of the Boolean VCSP [11], and the only tractable class in
the dichotomy classification of the Max-CSP problem for both 3-element sets [30] and arbitrary finite sets 
allowing constant (that is, fixed-value) constraints [14]. 

Cohen et al. showed that VCSP instances with submodular constraints over an arbitrary finite domain 
can be reduced to SFM [11], and hence can be solved in polynomial time. This tractability result has since 
been generalised to a wider class of valued constraints over arbitrary finite domains known as tournament- 
pair constraints [ ]. An alternative approach to solving VCSP instances with bounded-arity submodular 
constraints, based on linear programming, can be found in [12]. 

Computer Vision Gibbs energy minimisation and Markov Random Fields play an important role in
computer vision as they are applicable to a wide variety of vision problems, including image restoration, 
stereo vision and motion tracking, image synthesis, image segmentation, multi-camera scene reconstruction
and medical imaging [32]. Reducing energy minimisation to the Min-Cut problem has recently become a 
very popular approach, leading to the rediscovery of the property of submodularity [32, 16], and showing that 
certain special classes of functions can be minimised using graph cuts by introducing extra variables [42, 31]. 

Our results below characterise precisely which 4-ary submodular functions can be minimised using graph 
cuts in this way and which cannot. We also provide a new class of submodular functions of arbitrary arity 
which can be minimised efficiently in this way. 

2 Preliminaries 

In this section, we introduce the basic definitions and the main tools used throughout the paper.

2.1 Cost functions and expressibility

We denote by $\overline{\mathbb{R}}$ the set of all real numbers together with (positive) infinity. For any fixed set $D$, a function
$\phi$ from $D^n$ to $\overline{\mathbb{R}}$ will be called a cost function on $D$ of arity $n$. If the range of $\phi$ lies entirely within $\mathbb{R}$, then
$\phi$ is called a finite-valued cost function. If the range of $\phi$ is $\{0, \infty\}$, then $\phi$ can be viewed as a predicate, or
relation, allowing just those tuples $t \in D^n$ for which $\phi(t) = 0$.

Cost functions can be added and multiplied by arbitrary real values, hence for any given set of cost
functions, $\Gamma$, we can define the convex cone generated by $\Gamma$, as follows.

Definition 2.1. For any set of cost functions $\Gamma$, the cone generated by $\Gamma$, denoted $\mathrm{Cone}(\Gamma)$, is defined by:

$$\mathrm{Cone}(\Gamma) = \{\alpha_1 \phi_1 + \cdots + \alpha_r \phi_r \mid r \ge 1;\ \phi_1, \ldots, \phi_r \in \Gamma;\ \alpha_1, \ldots, \alpha_r \ge 0\}.$$

Definition 2.2. A cost function $\phi$ of arity $n$ is said to be expressible by a set of cost functions $\Gamma$ if

$$\phi = \min_{y_1, \ldots, y_j} \phi'(x_1, \ldots, x_n, y_1, \ldots, y_j) + \kappa,$$

for some $\phi' \in \mathrm{Cone}(\Gamma)$ and some constant $\kappa$.

The variables $y_1, \ldots, y_j$ are called extra (or hidden) variables, and $\phi'$ is called a gadget for $\phi$ over $\Gamma$.

Note that in the special case of relations this notion of expressibility corresponds to the standard notion 
of expressibility using conjunction and existential quantification (primitive positive formulas) [5]. 

We denote by $\langle \Gamma \rangle$ the expressive power of $\Gamma$, which is the set of all cost functions expressible by $\Gamma$.

It was shown in [7] that the expressive power of a set of cost functions is determined by certain algebraic 
properties of those cost functions called fractional polymorphisms. For the results of this paper, we will 
only need a certain subset of these algebraic properties, called multimorphisms [ ]. These are defined in 
Definition 2.3 below, which is illustrated in Figure 1. 

The $i$-th component of a tuple $t$ will be denoted by $t[i]$. Note that any operation on a set $D$ can be extended
to tuples over the set $D$ in a standard way, as follows. For any function $f : D^k \to D$, and any collection of
tuples $t_1, \ldots, t_k \in D^n$, define $f(t_1, \ldots, t_k) \in D^n$ to be the tuple $\langle f(t_1[1], \ldots, t_k[1]), \ldots, f(t_1[n], \ldots, t_k[n]) \rangle$.

Definition 2.3 ([11]). Let $\mathcal{F} : D^k \to D^k$ be the function whose $k$-tuple of output values is given by the tuple
of functions $\mathcal{F} = \langle f_1, \ldots, f_k \rangle$, where each $f_i : D^k \to D$.

For any $n$-ary cost function $\phi$, we say that $\mathcal{F}$ is a $k$-ary multimorphism of $\phi$ if, for all $t_1, \ldots, t_k \in D^n$,

$$\sum_{i=1}^{k} \phi(t_i) \ge \sum_{i=1}^{k} \phi(f_i(t_1, \ldots, t_k)).$$

For any set of cost functions, $\Gamma$, we will say that $\mathcal{F}$ is a multimorphism of $\Gamma$ if $\mathcal{F}$ is a multimorphism of
every cost function in $\Gamma$. The set of all multimorphisms of $\Gamma$ will be denoted $\mathrm{Mul}(\Gamma)$.

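Note that multimorphisms are preserved under expressibility. In other words, if $\mathcal{F} \in \mathrm{Mul}(\Gamma)$, and $\phi \in \langle \Gamma \rangle$,
then $\mathcal{F} \in \mathrm{Mul}(\{\phi\})$ [11, 7]. This has two important corollaries. First, if $\langle \Gamma_1 \rangle = \langle \Gamma_2 \rangle$, then $\mathrm{Mul}(\Gamma_1) = \mathrm{Mul}(\Gamma_2)$.
Second, if there exists $\mathcal{F} \in \mathrm{Mul}(\Gamma)$ such that $\mathcal{F} \notin \mathrm{Mul}(\{\phi\})$, then $\phi$ is not expressible over $\Gamma$, that is, $\phi \notin \langle \Gamma \rangle$.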

Figure 1: Inequality establishing $\mathcal{F} = \langle f_1, \ldots, f_k \rangle$ as a multimorphism of cost function $\phi$ (see Definition 2.3).

2.2 Lattices and submodularity

Recall that $L$ is a lattice if $L$ is a partially ordered set in which every pair of elements $(a, b)$ has a unique
supremum (the least upper bound of $a$ and $b$, called the join, denoted $a \vee b$) and a unique infimum (the
greatest lower bound, called the meet, denoted $a \wedge b$).

For any lattice-ordered set $D$, a cost function $\phi : D^n \to \overline{\mathbb{R}}$ is called submodular if for every $u, v \in D^n$,
$\phi(\min(u, v)) + \phi(\max(u, v)) \le \phi(u) + \phi(v)$, where both min and max are applied coordinate-wise on tuples
$u$ and $v$ [39]. This standard definition can be reformulated very simply in terms of multimorphisms: $\phi$ is
submodular if $\langle \min, \max \rangle \in \mathrm{Mul}(\{\phi\})$.
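
The multimorphism inequality of Definition 2.3 can be tested exhaustively on small domains. The sketch below is an illustration only; the function names and the two example cost functions are assumptions. It checks that the pair of operations (min, max) is a multimorphism of $-x_1 x_2$ but not of $x_1 x_2$, matching the reformulation of submodularity given above.

```python
from itertools import product


def apply_componentwise(f, tuples):
    """Apply a k-ary operation f to k tuples of equal length, coordinate-wise."""
    return tuple(f(*column) for column in zip(*tuples))


def is_multimorphism(fs, phi, domain, arity):
    """Check sum_i phi(t_i) >= sum_i phi(f_i(t_1..t_k)) for all t_1..t_k."""
    k = len(fs)
    all_tuples = list(product(domain, repeat=arity))
    for ts in product(all_tuples, repeat=k):
        lhs = sum(phi(t) for t in ts)
        rhs = sum(phi(apply_componentwise(f, ts)) for f in fs)
        if lhs < rhs:
            return False
    return True


def neg_product(t):
    """-x1*x2: a submodular binary cost function."""
    return -t[0] * t[1]


def pos_product(t):
    """x1*x2: supermodular, not submodular."""
    return t[0] * t[1]


print(is_multimorphism((min, max), neg_product, (0, 1), 2))
print(is_multimorphism((min, max), pos_product, (0, 1), 2))
```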

Using results from [47] and [11], it can be shown that any submodular cost function $\phi$ can be expressed
as the sum of a finite-valued submodular cost function $\phi_{\mathrm{fin}}$, and a submodular relation $\phi_{\mathrm{rel}}$, that is,
$\phi = \phi_{\mathrm{fin}} + \phi_{\mathrm{rel}}$.

Moreover, it is known that all submodular relations are binary decomposable [29], and hence expressible 
using only binary submodular relations. Therefore, when considering which cost functions are expressible by 
binary submodular cost functions, we can restrict our attention to finite-valued cost functions without any 
loss of generality. 

Next we define some particular families of submodular cost functions, first described in [ ], which will 
turn out to play a central role in our analysis. 

Definition 2.4. Let $L$ be a lattice. We define the following cost functions on $L$:

• For any set $F$ of pairwise incomparable elements $\{a_1, \ldots, a_m\} \subseteq L$, such that each pair of distinct
elements $(a_i, a_j)$ has the same least upper bound, $\bigvee F$, the following cost function is called an upper
fan:

$$\phi_F(x) = \begin{cases} -2 & \text{if } x \ge \bigvee F, \\ -1 & \text{if } x \not\ge \bigvee F \text{, but } x \ge a_i \text{ for some } i, \\ 0 & \text{otherwise.} \end{cases}$$

• For any set $G$ of pairwise incomparable elements $\{a_1, \ldots, a_m\} \subseteq L$, such that each pair of distinct
elements $(a_i, a_j)$ has the same greatest lower bound, $\bigwedge G$, the following cost function is called a lower
fan:

$$\phi_G(x) = \begin{cases} -2 & \text{if } x \le \bigwedge G, \\ -1 & \text{if } x \not\le \bigwedge G \text{, but } x \le a_i \text{ for some } i, \\ 0 & \text{otherwise.} \end{cases}$$

We call a cost function a fan if it is either an upper fan or a lower fan. It is not hard to show that all 
fans are submodular [41]. 

Note that our definition of fans is slightly more general than the definition in [ ]. In particular, we allow
the set $F$ to be empty, in which case the corresponding upper fan $\phi_F$ is a constant function.

2.3 Boolean cost functions and polynomials

In this paper we will focus on problems over Boolean domains, that is, where D = {0, 1}. 

Any cost function of arity $n$ can be represented as a table of values of size $|D|^n$. Moreover, a finite-valued
cost function $\phi : D^n \to \mathbb{R}$ on a Boolean domain $D = \{0, 1\}$ can also be represented as a unique
polynomial in $n$ (Boolean) variables with coefficients from $\mathbb{R}$ (such functions are sometimes called
pseudo-Boolean functions [ ]). Hence, in what follows, we will often refer to a finite-valued cost function on a Boolean
domain and its corresponding polynomial interchangeably.

For polynomials over Boolean variables there is a standard way to define derivatives of each order (see [4]).
For example, the second order derivative of a polynomial $p$, with respect to the first two indices, denoted
$\delta_{1,2}(\mathbf{x})$, is defined as $p(1, 1, \mathbf{x}) - p(1, 0, \mathbf{x}) - p(0, 1, \mathbf{x}) + p(0, 0, \mathbf{x})$. The derivatives for all other pairs
of indices are defined analogously. It
was shown in [ ] that a polynomial $p(x_1, \ldots, x_n)$ over Boolean variables $x_1, \ldots, x_n$ represents a submodular
cost function if and only if its second order derivatives $\delta_{i,j}(\mathbf{x})$ are non-positive for all $1 \le i < j \le n$ and all
$\mathbf{x} \in D^{n-2}$. An immediate corollary is that a quadratic polynomial represents a submodular cost function if
and only if the coefficients of all quadratic terms are non-positive.
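
The derivative criterion translates directly into a brute-force test. The following sketch is illustrative only: the function names and the three example polynomials are assumptions, and variables are indexed from 0 rather than 1. It evaluates every second order derivative at every assignment to the remaining $n - 2$ variables.

```python
from itertools import combinations, product


def second_derivative(p, i, j, rest):
    """delta_{i,j} of p at the assignment `rest` to the other variables (i < j)."""
    def at(a, b):
        x = list(rest)
        x.insert(i, a)  # position i gets the value a
        x.insert(j, b)  # position j gets the value b (valid because i < j)
        return p(tuple(x))
    return at(1, 1) - at(1, 0) - at(0, 1) + at(0, 0)


def is_submodular_polynomial(p, n):
    """True if all second order derivatives of p are non-positive."""
    return all(second_derivative(p, i, j, rest) <= 0
               for i, j in combinations(range(n), 2)
               for rest in product((0, 1), repeat=n - 2))


def neg_cubic(x):
    """-x0*x1*x2: a negative monomial, hence submodular."""
    return -x[0] * x[1] * x[2]


def mixed(x):
    """x0*x1 - x0*x1*x2: the derivative in (x0, x1) is +1 when x2 = 0."""
    return x[0] * x[1] - x[0] * x[1] * x[2]


def fan4(x):
    """2*x0x1x2x3 minus the four cubic monomials (an arity-4 upper fan)."""
    a, b, c, d = x
    return 2 * a * b * c * d - a * b * c - a * b * d - a * c * d - b * c * d


print(is_submodular_polynomial(neg_cubic, 3))
print(is_submodular_polynomial(mixed, 3))
print(is_submodular_polynomial(fan4, 4))
```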

Note that a cost function is called supermodular if all its second order derivatives are non-negative.
Clearly, $f$ is submodular if and only if $-f$ is supermodular. Cost functions which are both submodular
and supermodular (in other words, all second order derivatives are equal to zero) are called modular, and
polynomials corresponding to modular cost functions are linear [4].

Example 2.5. For any set of indices $I = \{i_1, \ldots, i_m\} \subseteq \{1, \ldots, n\}$ we can define a cost function $\phi_I$ in $n$
variables as follows:

$$\phi_I(x_1, \ldots, x_n) = \begin{cases} -1 & \text{if } (\forall i \in I)(x_i = 1), \\ 0 & \text{otherwise.} \end{cases}$$

The polynomial representation of $\phi_I$ is $p(x_1, \ldots, x_n) = -x_{i_1} \cdots x_{i_m}$, which is a polynomial of degree $m$. Note
that it is straightforward to verify that $\phi_I$ is submodular by checking the second order derivatives of $p$.

However, the function $\phi_I$ is also expressible by quadratic polynomials, using a single extra variable, $y$, as
follows:

$$\phi_I(x_1, \ldots, x_n) = \min_{y \in \{0,1\}} \Big\{ -y + y \sum_{i \in I} (1 - x_i) \Big\}.$$

We remark that this is a special case of the expressibility result for negative-positive polynomials first obtained 
in [43]. 
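
The gadget of Example 2.5 can be verified by enumerating all assignments. This is a small sketch with hypothetical helper names and 0-based indices: it minimises the quadratic expression over the extra variable $y$ and compares the result with the negative monomial.

```python
from itertools import product


def phi_I(x, index_set):
    """-1 if every variable indexed by index_set is 1, and 0 otherwise."""
    return -1 if all(x[i] == 1 for i in index_set) else 0


def gadget(x, y, index_set):
    """Quadratic gadget: -y + y * sum_{i in I} (1 - x_i)."""
    return -y + y * sum(1 - x[i] for i in index_set)


def expressed(x, index_set):
    """Minimise the gadget over the single extra variable y."""
    return min(gadget(x, y, index_set) for y in (0, 1))


I = {0, 2, 3}  # hypothetical index set inside 5 variables
agree = all(expressed(x, I) == phi_I(x, I)
            for x in product((0, 1), repeat=5))
print(agree)
```

Expanding the gadget gives $(|I| - 1)y - \sum_{i \in I} y x_i$, so every quadratic term has a non-positive coefficient, as required for submodularity.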

Note that when D = {0, 1}, the set D n with the product ordering is isomorphic to the lattice of all subsets 
of an n-element set ordered by inclusion. Hence, a cost function on a Boolean domain can be viewed as a cost 
function defined on a lattice of subsets, and we can apply Definition 2.4 to identify certain Boolean functions 
as upper fans or lower fans, as the following example indicates. 

Example 2.6. Let $F = \{I_1, \ldots, I_r\}$ be a set of subsets of $\{1, 2, \ldots, n\}$ such that for all $i \ne j$ we have $I_i \not\subseteq I_j$
and $I_i \cup I_j = \bigcup F$.

By Definition 2.4, the corresponding upper fan function $\phi_F$ has the following polynomial representation:

$$p(x_1, \ldots, x_n) = (r - 2) \prod_{i \in \bigcup F} x_i - \prod_{i \in I_1} x_i - \cdots - \prod_{i \in I_r} x_i.$$
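
The agreement between the case definition of an upper fan and its polynomial representation can be checked numerically. The sketch below is illustrative: the function names and the example families are assumptions, and indices are 0-based.

```python
from itertools import product
from math import prod


def upper_fan(x, family):
    """Upper fan on the subset lattice, following the case definition."""
    support = {i for i, v in enumerate(x) if v == 1}
    join = set().union(*family)
    if join <= support:
        return -2
    if any(s <= support for s in family):
        return -1
    return 0


def fan_polynomial(x, family):
    """(r - 2) * prod over the join, minus one monomial per set in the family."""
    join = set().union(*family)
    r = len(family)
    return ((r - 2) * prod(x[i] for i in join)
            - sum(prod(x[i] for i in s) for s in family))


def agree(family, n):
    """Compare both forms on every 0/1 assignment to n variables."""
    return all(upper_fan(x, family) == fan_polynomial(x, family)
               for x in product((0, 1), repeat=n))


# Hypothetical families: all pairwise unions equal the common join.
four_triples = [{0, 1, 2}, {0, 1, 3}, {0, 2, 3}, {1, 2, 3}]
two_sets = [{0, 1, 2}, {0, 1, 3}]
single = [{0, 1}]

print(agree(four_triples, 4), agree(two_sets, 5), agree(single, 3))
```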

We remark that any permutation of a set $D$ gives rise to an automorphism of cost functions over $D$. In
particular, for any cost function $f$ on a Boolean domain $D$, the dual of $f$ is the corresponding cost function
which results from exchanging the values 0 and 1 for all variables. In other words, if $p$ is the polynomial
representation of $f$, then the dual of $f$ is the cost function whose polynomial representation is obtained
from $p$ by replacing all variables $x$ with $1 - x$. Observe that, due to symmetry, taking the dual preserves
submodularity and expressibility by binary submodular cost functions.

It is not hard to see that upper fans are duals of lower fans and vice versa. 

3 Results

In this section, we present our main results. First, we show that fans of all arities are expressible by binary 
submodular cost functions. Next, we characterise the multimorphisms of binary submodular cost functions. 
Then, combining these results, we characterise precisely which 4-ary submodular cost functions
are expressible by binary submodular cost functions. More importantly, we show that some submodular cost 
functions are not expressible by binary submodular cost functions, and therefore cannot be minimised using 
the Min-Cut problem via an expressibility reduction. Finally, we describe some applications of these results 
to valued constraint satisfaction problems and certain optimisation problems arising in computer vision. 

3.1 Expressibility of upper fans and lower fans 

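We denote by $\Gamma_{\mathrm{sub},n}$ the set of all finite-valued submodular cost functions of arity at most $n$ on a Boolean
domain $D$, and we set $\Gamma_{\mathrm{sub}} = \bigcup_n \Gamma_{\mathrm{sub},n}$.

We denote by $\Gamma_{\mathrm{fans},n}$ the set of all fans of arity at most $n$ on a Boolean domain $D$, and we set
$\Gamma_{\mathrm{fans}} = \bigcup_n \Gamma_{\mathrm{fans},n}$.

Our next result shows that $\Gamma_{\mathrm{fans}} \subseteq \langle \Gamma_{\mathrm{sub},2} \rangle$.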

Theorem 3.1. Any fan on a Boolean domain $D$ is expressible by binary submodular functions on $D$ using
at most $1 + \lfloor m/2 \rfloor$ extra variables, where $m$ is the degree of its polynomial representation.

Proof. Since upper fans are dual to lower fans, it is sufficient to establish the result for upper fans only. 

Let $F = \{I_1, \ldots, I_r\}$ be a set of subsets of $\{1, 2, \ldots, n\}$ such that for all $i \ne j$ we have $I_i \not\subseteq I_j$ and
$I_i \cup I_j = \bigcup F$, and let $\phi_F$ be the corresponding upper fan, as specified by Definition 2.4. The polynomial
representation of $\phi_F$, $p(x_1, \ldots, x_n)$, is given in Example 2.6.

The degree of $p$ is equal to the total number of variables occurring in it, which will be denoted $m$. Note
that $m = |\bigcup F|$.

If $r = 0$, then $\phi_F$ is constant, so the result holds trivially. If $r = 1$, we have $F = \{I\}$, where $I = \{i_1, \ldots, i_m\}$
and the polynomial representation of $\phi_F$ is $-2 x_{i_1} x_{i_2} \cdots x_{i_m}$. In this case, it was shown in Example 2.5 that
$\phi_F$ can be expressed by quadratic functions using one extra variable, as follows:

$$\phi_F(x_1, \ldots, x_n) = \min_{y \in \{0,1\}} \Big\{ -2y + 2y \sum_{i \in I} (1 - x_i) \Big\}.$$

For the case when $r > 1$, we first note that any $i \in \bigcup F$ must belong to all the elements of $F$ except
for at most one (otherwise there would be two elements of $F$, say $I_i$ and $I_j$, such that $I_i \cup I_j \ne \bigcup F$, which
contradicts the choice of $F$).

We will say that two elements of $\bigcup F$ are equivalent if they occur in exactly the same elements of $F$, that
is, $i_1, i_2 \in \bigcup F$ are equivalent if $i_1 \in I_j \Leftrightarrow i_2 \in I_j$ for all $j \in \{1, \ldots, r\}$. Equivalent elements $i_1$ and $i_2$ of
$\bigcup F$ can be merged by replacing them with a single new element. In the polynomial representation of $\phi_F$
this corresponds to replacing the variables $x_{i_1}$ and $x_{i_2}$ with a single new variable, $z$, corresponding to their
product. Note that the number of equivalence classes of size two or greater is at most $\lfloor m/2 \rfloor$.

After completing all such merging, we obtain a new set $F' = \{I'_1, \ldots, I'_r\}$ with the property that $|I'_i| = m' - 1$
for every $i$, where $m' = |\bigcup F'|$ is the size of the common join of any $I'_i, I'_j \in F'$. This set has a
corresponding new upper fan, $\phi_{F'}$, over the new merged variables.

To complete the proof we will construct a simple gadget for expressing $\phi_{F'}$, and show how to use this to
obtain a gadget for expressing the original upper fan $\phi_F$.

Note that the sets $I'_i$ are subsets of $\bigcup F'$, each of size $m' - 1$. Any such subset is uniquely determined by
its single missing element. We denote by $K$ the set of elements occurring in all sets $I'_i$ and by $L$ the set of
elements which are missing from one of these subsets. Clearly, $|K| + |L| = m'$. We claim that the following
polynomial is a gadget for expressing $\phi_{F'}$:

$$p'(z_1, \ldots, z_{m'}) = \min_{y \in \{0,1\}} y \Big( 2m' - 2 - |L| - \sum_{i \in L} z_i - 2 \sum_{i \in K} z_i \Big).$$

To establish this claim, we will compute the value of $p'$, for each possible assignment to the variables
$z_1, \ldots, z_{m'}$. Denote by $k_0$ the number of 0s assigned to variables in $K$, and by $l_0$ the number of 0s
assigned to variables in $L$. Then we have:

$$\begin{aligned}
p'(z_1, \ldots, z_{m'}) &= \min_{y \in \{0,1\}} y \Big( 2m' - 2 - |L| - \sum_{i \in L} z_i - 2 \sum_{i \in K} z_i \Big) \\
&= \min_{y \in \{0,1\}} y \big( 2m' - 2 - |L| - (|L| - l_0) - 2(m' - |L| - k_0) \big) \\
&= \min_{y \in \{0,1\}} y \big( 2m' - 2 - 2|L| + l_0 - 2m' + 2|L| + 2k_0 \big) \\
&= \min_{y \in \{0,1\}} y \big( -2 + 2k_0 + l_0 \big).
\end{aligned}$$

Hence if $k_0 = l_0 = 0$, then $p'$ takes the value $-2$. If $k_0 = 0$ and $l_0 = 1$, then $p'$ takes the value $-1$. In all other
cases (that is, $k_0 > 0$ or $l_0 > 1$), $p'$ takes the value 0. By Definition 2.4, this means that $p'$ is the (unique)
polynomial representation for $\phi_{F'}$. Note that $p'$ uses just one extra variable, $y$.
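
The case analysis above can be confirmed by enumeration. The following sketch is illustrative only: the function names and the choices of $K$ and $L$ are assumptions, and indices are 0-based. It compares the value $\min_y y(2m' - 2 - |L| - \sum_{i \in L} z_i - 2\sum_{i \in K} z_i)$ from the derivation with the upper fan whose sets each omit exactly one element of $L$.

```python
from itertools import product


def gadget(z, K, L):
    """min over y in {0, 1} of y * (2m' - 2 - |L| - sum_L z_i - 2 * sum_K z_i)."""
    m = len(K) + len(L)
    bracket = (2 * m - 2 - len(L)
               - sum(z[i] for i in L) - 2 * sum(z[i] for i in K))
    return min(y * bracket for y in (0, 1))


def merged_fan(z, K, L):
    """Upper fan whose sets are (K | L) minus one element of L each."""
    support = {i for i, v in enumerate(z) if v == 1}
    everything = set(K) | set(L)
    if everything <= support:
        return -2
    if any(everything - {missing} <= support for missing in L):
        return -1
    return 0


def agree(K, L):
    """Compare gadget and fan on every assignment to the merged variables."""
    m = len(K) + len(L)
    return all(gadget(z, K, L) == merged_fan(z, K, L)
               for z in product((0, 1), repeat=m))


# Hypothetical index choices for the merged variables.
print(agree({0}, {1, 2, 3}))
print(agree(set(), {0, 1, 2, 3}))
```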

Finally, we show how to obtain a gadget for the original upper fan $\phi_F$, from the polynomial $p'$. Each
variable in $p'$ represents an equivalence class of elements of $\bigcup F$, so it can be replaced by a term consisting
of the product of the variables in this equivalence class. In this way we obtain a new polynomial over the
original variables containing linear and negative quadratic terms together with negative higher order terms
(cubic or above) corresponding to every equivalence class with 2 or more elements. However, each of these
higher order terms can itself be expressed by a quadratic submodular polynomial, by introducing a single
extra variable, as shown in the case when $r = 1$, above. Therefore, combining each of these polynomials, the
total number of new variables introduced is at most $1 + \lfloor m/2 \rfloor$. □

Many of the earlier expressibility results mentioned in Section 1.1 can be obtained as simple corollaries 
of Theorem 3.1, as the following examples indicate. 

Example 3.2. Any negative monomial $-x_1 x_2 \cdots x_m$ is a positive multiple of an upper fan, and the positive
linear monomial $x_1$ is equal to $-(1 - x_1) + 1$, so it is a positive multiple of a lower fan, plus a constant.
Hence, by Theorem 3.1, all negative-positive submodular polynomials are expressible by quadratic submodular
polynomials, as originally shown in [43].

Example 3.3. Any cubic submodular polynomial can be expressed as a positive sum of upper fans [ ]. Hence,
by Theorem 3.1, all cubic submodular polynomials are expressible by quadratic submodular polynomials, as
originally shown in [2].

Example 3.4. A Boolean cost function $\phi$ is called 2-monotone [13] if there exist two sets $A, B \subseteq \{1, \ldots, n\}$
such that $\phi(\mathbf{x}) = 0$ if $A \subseteq \mathbf{x}$ or $\mathbf{x} \subseteq B$ and $\phi(\mathbf{x}) = 1$ otherwise (where $A \subseteq \mathbf{x}$ means $\forall i \in A,\ \mathbf{x}[i] = 1$ and
$\mathbf{x} \subseteq B$ means $\forall i \notin B,\ \mathbf{x}[i] = 0$). It was shown in [ , Proposition 2.9] that a 2-valued Boolean cost function
is 2-monotone if and only if it is submodular.$^1$

For any 2-monotone cost function defined by the sets of indices $A$ and $B$, it is straightforward to check
that $\phi = \min_{y \in \{0,1\}} y(1 + \phi_F/2) + (1 - y)(1 + \phi_G/2)$ where $\phi_F$ is the upper fan defined by $F = \{A\}$ and $\phi_G$
is the lower fan defined by $G = \{B\}$. Note that the function $y\phi_F$ is an upper fan, and the function $(1 - y)\phi_G$
is a lower fan. Hence, by Theorem 3.1, all 2-monotone polynomials are expressible by quadratic submodular
polynomials, and solvable by reduction to Min-Cut, as originally shown in [13].
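
The identity used in Example 3.4 is easy to confirm by enumeration. In this sketch the function names and the particular sets `A` and `B` are assumptions for illustration, with 0-based indices; the single-set upper and lower fans take only the values $-2$ and $0$.

```python
from itertools import product


def support(x):
    """Set of positions where the 0/1 tuple x takes the value 1."""
    return {i for i, v in enumerate(x) if v == 1}


def two_monotone(x, A, B):
    """0 if A is contained in x or x is contained in B, and 1 otherwise."""
    s = support(x)
    return 0 if (A <= s or s <= B) else 1


def upper_fan_single(x, A):
    """Upper fan for F = {A}: -2 when A is contained in x, else 0."""
    return -2 if A <= support(x) else 0


def lower_fan_single(x, B):
    """Lower fan for G = {B}: -2 when x is contained in B, else 0."""
    return -2 if support(x) <= B else 0


def via_fans(x, A, B):
    """min over y of y*(1 + phi_F/2) + (1 - y)*(1 + phi_G/2)."""
    up = upper_fan_single(x, A)
    low = lower_fan_single(x, B)
    return min(y * (1 + up / 2) + (1 - y) * (1 + low / 2) for y in (0, 1))


A, B = {0, 1}, {2}  # hypothetical index sets over 4 variables
agree = all(via_fans(x, A, B) == two_monotone(x, A, B)
            for x in product((0, 1), repeat=4))
print(agree)
```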

However, Theorem 3.1 also provides many new functions of all arities which have not previously been 
shown to be expressible by quadratic submodular functions, as the following example indicates. 

Example 3.5. The function $2x_1x_2x_3x_4 - x_1x_2x_3 - x_1x_2x_4 - x_1x_3x_4 - x_2x_3x_4$ belongs to $\Gamma_{\mathrm{fans},4}$, but does not
belong to any class of submodular functions which has previously been shown to be expressible by quadratic
submodular functions. In particular, it does not belong to the class $\Gamma_{\mathrm{new}}$ identified in [52, 51].

1 In fact, [10] studied supermodular cost functions, but as $f$ is supermodular if and only if $-f$ is submodular, the results
translate easily.

3.2 Characterising $\mathrm{Mul}(\Gamma_{\mathrm{sub},2})$

Since we have seen that a cost function can only be expressed by a given set of cost functions if every
multimorphism of that set is also a multimorphism of the cost function, we now investigate the multimorphisms of $\Gamma_{\mathrm{sub},2}$.

A function $\mathcal{F} : D^k \to D^k$ is called conservative if, for each possible choice of $x_1, \ldots, x_k$, the tuple
$\mathcal{F}(x_1, \ldots, x_k)$ contains the same multi-set of values, $x_1, \ldots, x_k$ (in some order).

For any two tuples $\mathbf{x} = \langle x_1, \ldots, x_k \rangle$ and $\mathbf{y} = \langle y_1, \ldots, y_k \rangle$ over $D$, we denote by $H(\mathbf{x}, \mathbf{y})$ the Hamming
distance between $\mathbf{x}$ and $\mathbf{y}$, which is the number of positions at which the corresponding values are different.

Theorem 3.6. For any Boolean domain $D$, and any $\mathcal{F} : D^k \to D^k$, the following are equivalent:

1. $\mathcal{F} \in \mathrm{Mul}(\Gamma_{\mathrm{sub},2})$.

2. $\mathcal{F} \in \mathrm{Mul}(\Gamma^{\infty}_{\mathrm{sub},2})$, where $\Gamma^{\infty}_{\mathrm{sub},2}$ denotes the set of binary submodular cost functions taking finite or infinite
values.

3. $\mathcal{F}$ is conservative and Hamming distance non-increasing.
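
Condition 3 of Theorem 3.6 can be tested directly on small examples. The sketch below is an illustration under stated assumptions: the function names and the example operations are hypothetical, and each operation takes a $k$-tuple over $\{0, 1\}$ and returns a $k$-tuple.

```python
from itertools import product


def is_conservative(F, k):
    """F(x) must contain the same multi-set of values as x, for every x."""
    return all(sorted(F(x)) == sorted(x)
               for x in product((0, 1), repeat=k))


def hamming(u, v):
    """Number of positions at which u and v differ."""
    return sum(a != b for a, b in zip(u, v))


def is_distance_non_increasing(F, k):
    """H(F(u), F(v)) <= H(u, v) for all u, v in {0, 1}^k."""
    points = list(product((0, 1), repeat=k))
    return all(hamming(F(u), F(v)) <= hamming(u, v)
               for u in points for v in points)


def min_max(x):
    """The pair (min, max) viewed as a map on 2-tuples."""
    return (min(x), max(x))


def sort3(x):
    """Sorting a 3-tuple: conservative and distance non-increasing."""
    return tuple(sorted(x))


def perturbed(x):
    """Identity except on (1, 0, 0): conservative, but distances can grow."""
    return (0, 0, 1) if x == (1, 0, 0) else x


def constant_zero(x):
    """Not conservative, although it never increases distances."""
    return (0, 0, 0)


for F, k in [(min_max, 2), (sort3, 3), (perturbed, 3), (constant_zero, 3)]:
    print(F.__name__, is_conservative(F, k), is_distance_non_increasing(F, k))
```

Only operations passing both checks satisfy condition 3; `perturbed` and `constant_zero` each fail exactly one of them.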

Proof. First we consider unary cost functions. All unary cost functions on a Boolean domain are easily shown
to be submodular. Also, any conservative function $\mathcal{F} : D^k \to D^k$ is clearly a multimorphism of any unary
cost function, since it merely permutes its arguments.

For any $d \in D$ and $c \in \mathbb{R}$, define the unary cost function $\mu^d_c$ as follows:

$$\mu^d_c(x) = \begin{cases} c & \text{if } x = d, \\ 0 & \text{if } x \ne d. \end{cases}$$

Let $\mathcal{F} : D^k \to D^k$ be a non-conservative function. In that case, there are $u_1, \ldots, u_k, v_1, \ldots, v_k \in D$
such that $\mathcal{F}(u_1, \ldots, u_k) = \langle v_1, \ldots, v_k \rangle$ and there is $i$ such that $v_i$ occurs more often in $\langle v_1, \ldots, v_k \rangle$ than in
$\langle u_1, \ldots, u_k \rangle$. It is simple to check that $\mathcal{F}$ is not a multimorphism of the unary cost function $\mu^{v_i}_1$. Hence any
$\mathcal{F} \in \mathrm{Mul}(\Gamma_{\mathrm{sub},2})$ must be conservative.

By the same argument, any $\mathcal{F} \in \mathrm{Mul}(\Gamma^{\infty}_{\mathrm{sub},2})$ must be conservative.

For any $c \in \overline{\mathbb{R}}$, define the binary cost functions $\lambda_c$ and $\chi_c$ as follows:

$$\lambda_c(x, y) = \begin{cases} c & \text{if } x = 0 \text{ and } y = 1, \\ 0 & \text{otherwise,} \end{cases} \qquad \chi_c(x, y) = \begin{cases} c & \text{if } x \ne y, \\ 0 & \text{otherwise.} \end{cases}$$

Note that $\chi_c(x, y) = \lambda_c(x, y) + \lambda_c(y, x)$.

By a simple case analysis, it is straightforward to check that any binary submodular cost function on a
Boolean domain can be expressed by binary functions of the form $\lambda_c$, with $c > 0$, together with unary cost
functions of the form $\mu^d_c$.

We observe that when $c < \infty$, $\lambda_c(x, y) = (\chi_c(x, y) + \mu^0_c(x) + \mu^1_c(y) - c)/2$, so $\lambda_c$ can be expressed by
functions of the form $\chi_c$ together with unary cost functions of the form $\mu^d_c$. Hence, since expressibility
preserves multimorphisms, $\mathrm{Mul}(\Gamma_{\mathrm{sub},2}) = \mathrm{Mul}(\{\chi_c \mid c \in \mathbb{R}, c > 0\}) \cap \mathrm{Mul}(\{\mu^d_c \mid c \in \mathbb{R}, d \in D\})$.
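
The two identities relating these cost functions can be checked on all four Boolean pairs. This is a small illustrative sketch; the Python function names `lam`, `chi` and `mu` are assumptions standing for $\lambda_c$, $\chi_c$ and $\mu^d_c$, and only finite values of $c$ are tested.

```python
from itertools import product


def lam(c, x, y):
    """lambda_c: cost c exactly when x = 0 and y = 1."""
    return c if (x == 0 and y == 1) else 0


def chi(c, x, y):
    """chi_c: cost c exactly when x and y differ."""
    return c if x != y else 0


def mu(c, d, x):
    """mu^d_c: unary cost c exactly when x = d."""
    return c if x == d else 0


def identities_hold(c):
    """Check chi = lam(x,y) + lam(y,x) and lam = (chi + mu^0 + mu^1 - c)/2."""
    for x, y in product((0, 1), repeat=2):
        if chi(c, x, y) != lam(c, x, y) + lam(c, y, x):
            return False
        if lam(c, x, y) != (chi(c, x, y) + mu(c, 0, x) + mu(c, 1, y) - c) / 2:
            return False
    return True


print(all(identities_hold(c) for c in (1, 2.5, 7)))
```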

Now let $u, v \in D^k$, and consider the multimorphism inequality, as given in Definition 2.3, for the case
where $t_i = (u[i], v[i])$, for $i = 1, \ldots, k$. By Definition 2.3, for any $c > 0$, $\mathcal{F}$ is a multimorphism of $\chi_c$ if and
only if the following holds for all choices of $u$ and $v$:

$$H(u,v) \ge H(\mathcal{F}(u), \mathcal{F}(v)).$$

This proves that the multimorphisms of $\Gamma_{\mathrm{sub},2}$ are precisely the conservative functions which are also Hamming
distance non-increasing.

Since $\Gamma_{\mathrm{sub},2} \subseteq \Gamma^{\infty}_{\mathrm{sub},2}$, we know that $\mathrm{Mul}(\Gamma^{\infty}_{\mathrm{sub},2}) \subseteq \mathrm{Mul}(\Gamma_{\mathrm{sub},2})$. Therefore, in order to complete the
proof it is enough to show that every conservative and Hamming distance non-increasing function $\mathcal{F}$ is a
multimorphism of $\lambda_\infty$.

For any $u, v \in \{0,1\}^k$, the Hamming distance $H(u,v)$ is equal to the size of the symmetric difference of the sets of
positions where $u$ and $v$ take the value 1. Hence, for tuples $u$ and $v$ containing some fixed number of 1s, the
minimum Hamming distance occurs precisely when one of these sets of positions is contained in the other.

Now consider again the multimorphism inequality, as given in Definition 2.3, for the case where $t_i =
(u[i], v[i])$, for $i = 1, \ldots, k$. If there is any position $i$ where $u[i] = 0$ and $v[i] = 1$, then $\lambda_\infty(t_i) = \infty$, so the
multimorphism inequality is trivially satisfied. If there is no such position, then the set of positions where $v$
takes the value 1 is contained in the set of positions where $u$ takes the value 1, so $H(u,v)$ takes its minimum
possible value over all reorderings of $u$ and $v$. Hence if $\mathcal{F}$ is conservative, then $H(u,v) \le H(\mathcal{F}(u), \mathcal{F}(v))$,
and if $\mathcal{F}$ is Hamming distance non-increasing, we have $H(u,v) = H(\mathcal{F}(u), \mathcal{F}(v))$. But this implies that the
set of positions where $\mathcal{F}(v)$ takes the value 1 is contained in the set of positions where $\mathcal{F}(u)$ takes the value
1. By definition of $\lambda_\infty$, this implies that both sides of the multimorphism inequality are zero, so $\mathcal{F}$ is a
multimorphism of $\lambda_\infty$. □
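Condition 3 of Theorem 3.6 is easy to test by brute force for small $k$. The sketch below assumes that "conservative" means the output tuple is a rearrangement of the input tuple, as the proof indicates; the function names, and the two example operations `sort_tuple` and `sort_singletons`, are hypothetical and are not taken from the paper.

```python
from itertools import product


def hamming(u, v):
    """Number of positions in which the tuples u and v differ."""
    return sum(a != b for a, b in zip(u, v))


def is_conservative(F, k):
    """True if F(u) is a rearrangement of u for every Boolean k-tuple u."""
    return all(sorted(F(u)) == sorted(u) for u in product((0, 1), repeat=k))


def is_hamming_non_increasing(F, k):
    """True if H(u, v) >= H(F(u), F(v)) for all Boolean k-tuples u, v."""
    tuples = list(product((0, 1), repeat=k))
    return all(
        hamming(u, v) >= hamming(F(u), F(v)) for u in tuples for v in tuples
    )


def sort_tuple(u):
    """Hypothetical example: sort the entries in ascending order."""
    return tuple(sorted(u))


def sort_singletons(u):
    """Hypothetical example: sort only the tuples containing exactly one 1."""
    return tuple(sorted(u)) if sum(u) == 1 else tuple(u)
```

For $k = 3$, `sort_tuple` passes both checks. `sort_singletons` is conservative but not Hamming distance non-increasing: it maps $u = (1,0,0)$ to $(0,0,1)$ and leaves $v = (1,1,0)$ fixed, so the distance grows from 1 to 3.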

3.3 Non-expressibility of $\Gamma_{\mathrm{sub}}$ over $\Gamma_{\mathrm{sub},2}$

Consider the (carefully chosen) function $\mathcal{F}_{\mathrm{sep}} : \{0,1\}^5 \to \{0,1\}^5$ defined in Figure 2. We will show in this
section that this particular function can be used to characterise all the submodular functions of arity 4 which
are expressible by binary submodular functions on a Boolean domain, and hence show that some submodular
functions are not expressible.

Figure 2: Definition of $\mathcal{F}_{\mathrm{sep}}$.



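Proposition 3.7. $\mathcal{F}_{\mathrm{sep}}$ is conservative and Hamming distance non-increasing.

Proof. Straightforward exhaustive verification. □

Theorem 3.8. For any function $f \in \Gamma_{\mathrm{sub},4}$ the following are equivalent:

1. $f \in \langle \Gamma_{\mathrm{sub},2} \rangle$;

2. $\mathcal{F}_{\mathrm{sep}} \in \mathrm{Mul}(\{f\})$;

3. $f \in \mathrm{Cone}(\Gamma_{\mathrm{fans},4})$.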

Proof. Proposition 3.7 and Theorem 3.6 imply that $\mathcal{F}_{\mathrm{sep}}$ is a multimorphism of any binary submodular
function on a Boolean domain. Hence having $\mathcal{F}_{\mathrm{sep}}$ as a multimorphism is a necessary condition for any
submodular cost function on a Boolean domain to be expressible by binary submodular cost functions.

We will now complete the proof by showing that for 4-ary submodular cost functions on a Boolean domain
having $\mathcal{F}_{\mathrm{sep}}$ as a multimorphism is also sufficient to ensure expressibility by binary cost functions.

We consider the complete set of inequalities on the values of a 4-ary cost function resulting from having
the multimorphism $\mathcal{F}_{\mathrm{sep}}$, as specified in Definition 2.3. Out of $16^5$ such inequalities, there are 4635 which
are distinct. After removing from these all those which are equal to the sum of two others, we obtain a
system of just 30 inequalities which must be satisfied by any 4-ary submodular cost function which has the
multimorphism $\mathcal{F}_{\mathrm{sep}}$. Using the double description method$^2$ [ ] we obtain from these 30 inequalities an
equivalent set of 31 extreme rays which generate the same polyhedral cone of cost functions. These extreme
rays all correspond to fans or sums of fans, and hence are expressible over $\Gamma_{\mathrm{sub},2}$, by Theorem 3.1. It follows
that any cost function in this cone of functions is also expressible over $\Gamma_{\mathrm{sub},2}$. □

Next we show that there are indeed 4-ary submodular cost functions which do not have $\mathcal{F}_{\mathrm{sep}}$ as a
multimorphism and therefore are not expressible by binary submodular cost functions.

Definition 3.9. For any Boolean tuple $t$ of arity 4 containing exactly two ones and two zeros, we define the
4-ary cost function $\theta_t$ as follows:

$$\theta_t(x_1,x_2,x_3,x_4) = \begin{cases} -1 & \text{if } (x_1,x_2,x_3,x_4) = (1,1,1,1) \text{ or } (0,0,0,0), \\ 1 & \text{if } (x_1,x_2,x_3,x_4) = t, \\ 0 & \text{otherwise.} \end{cases}$$



Cost functions of the form $\theta_t$ were introduced in [41], where they are called quasi-indecomposable functions.
We denote by $\Gamma_{\mathrm{qin}}$ the set of all (six) quasi-indecomposable cost functions of arity 4. It is straightforward to
check that they are submodular, but the next result shows that they are not expressible by binary submodular
functions.
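The submodularity of the six quasi-indecomposable functions can be confirmed by exhaustive enumeration. The following sketch is one way to do this, using the lattice form of the submodularity inequality, $f(\min(x,y)) + f(\max(x,y)) \le f(x) + f(y)$ with componentwise minimum and maximum; the names `theta`, `is_submodular` and `QIN_TUPLES` are illustrative only.

```python
from itertools import product


def theta(t):
    """The 4-ary cost function of Definition 3.9 for a tuple t with two ones."""
    t = tuple(t)

    def f(x):
        x = tuple(x)
        if x in ((0, 0, 0, 0), (1, 1, 1, 1)):
            return -1
        if x == t:
            return 1
        return 0

    return f


def is_submodular(f, arity):
    """Check f(min(x, y)) + f(max(x, y)) <= f(x) + f(y) for all Boolean x, y."""
    points = list(product((0, 1), repeat=arity))
    for x in points:
        for y in points:
            meet = tuple(min(a, b) for a, b in zip(x, y))
            join = tuple(max(a, b) for a, b in zip(x, y))
            if f(meet) + f(join) > f(x) + f(y):
                return False
    return True


# The six Boolean 4-tuples containing exactly two ones.
QIN_TUPLES = [t for t in product((0, 1), repeat=4) if sum(t) == 2]
```

Calling `is_submodular(theta(t), 4)` for each `t` in `QIN_TUPLES` examines all $16 \times 16$ pairs of arguments, which is small enough to run instantly.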

Proposition 3.10. For all $\theta \in \Gamma_{\mathrm{qin}}$, $\mathcal{F}_{\mathrm{sep}} \notin \mathrm{Mul}(\{\theta\})$.

Proof. The table in Figure 3 shows that $\mathcal{F}_{\mathrm{sep}} \notin \mathrm{Mul}(\{\theta_{(1,1,0,0)}\})$. Permuting the columns appropriately
establishes the result for all other $\theta \in \Gamma_{\mathrm{qin}}$. □

Figure 3: $\mathcal{F}_{\mathrm{sep}} \notin \mathrm{Mul}(\{\theta_{(1,1,0,0)}\})$.



Corollary 3.11. For all $\theta \in \Gamma_{\mathrm{qin}}$, $\theta \notin \langle \Gamma_{\mathrm{sub},2} \rangle$.

Proof. By Theorem 3.8 and Proposition 3.10. □ 

Are there any other 4-ary submodular cost functions which are not expressible over $\Gamma_{\mathrm{sub},2}$? Promislow
and Young characterised the extreme rays of the cone of all 4-ary submodular$^3$ cost functions and established
that $\Gamma_{\mathrm{sub},4} = \mathrm{Cone}(\Gamma_{\mathrm{fans},4} \cup \Gamma_{\mathrm{qin}})$ (see Theorem 5.2 of [41]). Hence the results in this section characterise the
expressibility of all 4-ary submodular functions.

$^2$ As implemented, for example, by the program Skeleton available from http://www.uic.nnov.ru/~zny/skeleton/
$^3$ In fact, [41] studied supermodular cost functions, but as $f$ is supermodular if and only if $-f$ is submodular, the results
translate easily.

Promislow and Young conjectured that for $k \neq 4$, all extreme rays of $\Gamma_{\mathrm{sub},k}$ are fans [41]. However, if
this conjecture were true it would imply that all submodular functions of arity 5 and above were expressible
by binary submodular functions, by Theorem 3.1. This is clearly not the case, because inexpressible cost
functions such as those identified in Corollary 3.11 can be extended to larger arities (e.g., by adding dummy
arguments) and remain inexpressible. Hence our results refute this conjecture. However, we suggest that
this conjecture can be refined to a similar statement concerning just those submodular functions which are
expressible by binary submodular functions, as follows:

Conjecture 3.12. For all $k$, $\Gamma_{\mathrm{sub},k} \cap \langle \Gamma_{\mathrm{sub},2} \rangle = \mathrm{Cone}(\Gamma_{\mathrm{fans},k})$.

This conjecture was previously known to be true for $k \le 3$ [41]; Theorem 3.8 confirms that it holds for
$k = 4$.

Next we show that we can test efficiently whether a submodular polynomial of degree 4 is expressible by 
quadratic submodular polynomials. 

Definition 3.13. Let $p(x_1,x_2,x_3,x_4)$ be the polynomial representation of a 4-ary submodular cost function
$f$. We denote by $a_I$ the coefficient of the term $\prod_{i \in I} x_i$. We say that $f$ satisfies condition Sep if for each
$\{i,j\}, \{k,l\} \subseteq \{1,2,3,4\}$, with $i,j,k,l$ distinct, we have $a_{\{i,j\}} + a_{\{k,l\}} + a_{\{i,j,k\}} + a_{\{i,j,l\}} \le 0$.

Theorem 3.14. For any $f \in \Gamma_{\mathrm{sub},4}$, the following are equivalent:

1. $f \in \langle \Gamma_{\mathrm{sub},2} \rangle$;

2. $f$ satisfies condition Sep.

Proof. As in the proof of Theorem 3.8, we can construct a set of 30 inequalities corresponding to the
multimorphism $\mathcal{F}_{\mathrm{sep}}$. Each of these inequalities on the values of a cost function can be translated into inequalities
on the coefficients of the corresponding polynomial representation. 24 of them impose the condition of
submodularity, and the remaining 6 inequalities impose condition Sep. Hence a submodular cost function of
arity 4 has the multimorphism $\mathcal{F}_{\mathrm{sep}}$ if and only if its polynomial representation satisfies condition Sep. The
result then follows from Theorem 3.8. □

Corollary 3.15. Given a submodular polynomial p of degree 4, condition Sep can be used to test in polynomial 
time whether p is expressible by quadratic submodular polynomials. 
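As an illustration of how such a test might look, the sketch below recovers the polynomial coefficients $a_I$ of a 4-ary function by Möbius inversion and then evaluates the six inequalities of condition Sep. It assumes that condition Sep reads $a_{\{i,j\}} + a_{\{k,l\}} + a_{\{i,j,k\}} + a_{\{i,j,l\}} \le 0$ for each pair $\{i,j\}$ with complementary pair $\{k,l\}$, and that the input is already known to be submodular. All function names are hypothetical, and variables are indexed from 0.

```python
from itertools import combinations


def coefficients(f, arity=4):
    """Coefficients a_I of the multilinear polynomial of f, by Moebius inversion:
    a_I = sum over J subset of I of (-1)^(|I| - |J|) * f(indicator of J)."""
    indices = range(arity)
    coeffs = {}
    for r in range(arity + 1):
        for I in combinations(indices, r):
            total = 0
            for s in range(r + 1):
                for J in combinations(I, s):
                    x = tuple(1 if i in J else 0 for i in indices)
                    total += (-1) ** (r - s) * f(x)
            coeffs[frozenset(I)] = total
    return coeffs


def satisfies_sep(f):
    """Evaluate the six Sep inequalities (one per choice of the pair {i, j})."""
    a = coefficients(f, 4)
    for i, j in combinations(range(4), 2):
        k, l = (m for m in range(4) if m not in (i, j))
        lhs = (a[frozenset((i, j))] + a[frozenset((k, l))]
               + a[frozenset((i, j, k))] + a[frozenset((i, j, l))])
        if lhs > 0:
            return False
    return True


def theta_1100(x):
    """The quasi-indecomposable function of Definition 3.9 for t = (1, 1, 0, 0)."""
    x = tuple(x)
    if x in ((0, 0, 0, 0), (1, 1, 1, 1)):
        return -1
    return 1 if x == (1, 1, 0, 0) else 0


def quadratic_example(x):
    """A sum of two binary submodular terms: -x0*x1 - x2*x3."""
    return -x[0] * x[1] - x[2] * x[3]
```

Under these assumptions `satisfies_sep(quadratic_example)` holds, while `satisfies_sep(theta_1100)` fails at the pair $\{i,j\}$ consisting of the last two variables, where the left-hand side evaluates to $-1 + 0 + 1 + 1 = 1$. This agrees with Corollary 3.11.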

In contrast to this result, it is known that the recognition problem for submodular polynomials of degree 
4 is co-NP-complete [18]. Given an arbitrary polynomial of degree 4, condition Sep recognises expressible 
polynomials under the assumption that the polynomial is submodular. One might hope that submodular 
polynomials which are expressible by quadratic submodular polynomials would be recognisable in polynomial 
time. Unfortunately, this is not the case. In fact, as all polynomials of degree 4 used in the reduction given 
in [18] satisfy condition Sep, the original reduction from [18] proves the following: 

Proposition 3.16. Given an arbitrary polynomial p of degree 4, it is co-NP-complete to test whether p is a 
submodular polynomial which is expressible by quadratic submodular polynomials. 

3.4 Applications 

As mentioned above, testing submodularity is co-NP-complete even for polynomials of degree 4 [18]. However, 
for many of the optimisation problems arising in practice, testing for submodularity is not an issue because 
the function to be minimised is presented as a sum of functions of bounded arity. In such cases, each of the 
bounded-arity sub-functions can be tested for submodularity in constant time. For example, in constraint 
satisfaction problems and computer vision, each instance is specified as a sum of bounded-arity functions 
and these can be independently tested for submodularity. The recognition of submodularity only becomes 
co-NP-complete when a function is presented without a fixed decomposition into sub-functions of this kind.

Artificial Intelligence. First we formally define valued constraint satisfaction problems [45, 3, 44].

Definition 3.17. An instance $\mathcal{P}$ of VCSP is a triple $(V, D, C)$, where $V$ is a finite set of variables, which
are to be assigned values from the set $D$, and $C$ is a set of valued constraints. Each $c \in C$ is a pair $c = (\sigma, \phi)$,
where $\sigma$ is a tuple of variables of length $|\sigma|$, called the scope of $c$, and $\phi : D^{|\sigma|} \to \mathbb{R}$ is a cost function. An
assignment for the instance $\mathcal{P}$ is a mapping $s$ from $V$ to $D$. The cost of an assignment $s$ is defined as follows:

$$\mathrm{Cost}_{\mathcal{P}}(s) = \sum_{((v_1, v_2, \ldots, v_m), \phi) \in C} \phi(s(v_1), s(v_2), \ldots, s(v_m)).$$

A solution to $\mathcal{P}$ is an assignment with minimum cost.
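To make the definition concrete, here is a minimal sketch of a VCSP instance and its cost function. The representation (a list of `(scope, cost_function)` pairs, assignments as dictionaries) and the names `cost` and `solve_brute_force` are assumptions made for illustration. The brute-force search is exponential in the number of variables and is unrelated to the Min-Cut reduction discussed in this paper.

```python
from itertools import product


def cost(assignment, constraints):
    """Sum of phi(s(v_1), ..., s(v_m)) over all valued constraints (scope, phi)."""
    return sum(
        phi(*(assignment[v] for v in scope)) for scope, phi in constraints
    )


def solve_brute_force(variables, domain, constraints):
    """Return (assignment, cost) of minimum cost by enumerating all assignments."""
    best = None
    for values in product(domain, repeat=len(variables)):
        s = dict(zip(variables, values))
        c = cost(s, constraints)
        if best is None or c < best[1]:
            best = (s, c)
    return best


# A small hypothetical instance over the Boolean domain.
example_variables = ["a", "b", "c"]
example_constraints = [
    (("a",), lambda x: 2 if x == 0 else 0),        # unary: prefers a = 1
    (("a", "b"), lambda x, y: 3 if x != y else 0),  # binary: prefers a = b
    (("c",), lambda x: 1 if x == 1 else 0),        # unary: prefers c = 0
]
```

For this instance, `solve_brute_force(example_variables, (0, 1), example_constraints)` finds the unique zero-cost assignment with `a` and `b` set to 1 and `c` set to 0.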

Now we show how our results can be applied in this framework. 

Corollary 3.18 (of Theorem 3.1). VCSP($\Gamma_{\mathrm{fans}}$) is solvable in $O((n + k)^3)$ time, where $n$ is the number
of variables and $k$ is the number of higher-order (ternary and above) constraints.

Moreover, as shown above, VCSP($\Gamma_{\mathrm{fans},4}$) is the maximal class in VCSP($\Gamma_{\mathrm{sub},4}$) which can be solved by
reduction to Min-Cut in this way.

Cohen et al. [7] showed that if a cost function $\phi$ of arity $k$ is expressible by some set of cost functions
over $\Gamma$, then $\phi$ is expressible by $\Gamma$ using at most $2^{2^k}$ extra variables. Our results show that only $O(k)$ extra
variables are needed to express any cost function from $\Gamma_{\mathrm{fans},k}$ by $\Gamma_{\mathrm{sub},2}$. Therefore, an instance of VCSP($\Gamma_{\mathrm{fans}}$)
needs only linearly many (in the number of higher-order constraints) extra variables, where the linear factor
is proportional to the maximum arity of the constraints. In particular, an instance of VCSP($\Gamma_{\mathrm{sub},4}$) is either
reducible to Min-Cut with only linearly many extra variables,$^4$ or is not reducible at all.

Computer Vision. In computer vision, many problems can be naturally formulated in terms of energy
minimisation where the energy function, over a set of variables $\{x_v\}_{v \in V}$, has the following form:

$$E(x) = c_0 + \sum_{v \in V} c_v(x_v) + \sum_{(u,v) \in V \times V} c_{uv}(x_u, x_v) + \ldots$$

The set $V$ usually corresponds to pixels, and $x_v$ denotes the label of pixel $v \in V$, which must belong to a finite
domain $D$. The constant term of the energy is $c_0$, the unary terms $c_v(\cdot)$ encode data penalty functions, the
pairwise terms $c_{uv}(\cdot, \cdot)$ are interaction potentials, and so on. Functions of arity 3 and above are also called
higher-order cliques. This energy is often derived in the context of Markov Random Fields [19, 1]: a minimum
of $E$ corresponds to a maximum a-posteriori (MAP) labelling $x$ [35, 49].

It is straightforward to see that this formulation is equivalent to VCSP. See [50] for a survey on the connection between
computer vision and constraint satisfaction problems. Therefore, for energy minimisation over Boolean 
variables we get the following: 

Corollary 3.19 (of Theorem 3.1). Energy minimisation, where each term of the energy function belongs to
$\Gamma_{\mathrm{fans}}$, is solvable in $O((n + k)^3)$ time, where $n$ is the number of variables (pixels) and $k$ is the number
of higher-order (ternary and above) terms in the energy function.

Note that any variable over a non-Boolean domain $D = \{0, 1, \ldots, d-1\}$ of size $d$ can be encoded by
$d - 1$ Boolean variables. One such encoding is the following: $\mathrm{en}(i) = 0^{d-i-1}1^i$. We replace each
variable with $d - 1$ new Boolean variables and impose a (submodular) relation on these new variables which
ensures that they only take values in the range of the encoding function en. Note that $\mathrm{en}(\max(a,b)) =
\max(\mathrm{en}(a), \mathrm{en}(b))$ and $\mathrm{en}(\min(a,b)) = \min(\mathrm{en}(a), \mathrm{en}(b))$, so this encoding preserves submodularity. Observe
that any submodularity-preserving encoding of a non-Boolean variable by Boolean variables needs
$\Omega(d)$ variables. However, for practical purposes, subclasses of non-Boolean submodular functions which can
be encoded by Boolean submodular functions with fewer variables have been studied, as well as approximation
algorithms for these problems [42, 31].
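The lattice-preserving property of this encoding can be checked directly. The sketch below assumes the encoding is the unary one, mapping $i$ to the tuple $0^{d-i-1}1^i$ of length $d - 1$, and that max and min on tuples are taken componentwise; the helper names are illustrative only.

```python
def en(i, d):
    """Encode i in {0, ..., d-1} as the Boolean tuple 0^(d-i-1) 1^i of length d-1."""
    if not 0 <= i < d:
        raise ValueError("value outside the domain")
    return (0,) * (d - i - 1) + (1,) * i


def tuple_max(s, t):
    """Componentwise maximum of two tuples of equal length."""
    return tuple(max(a, b) for a, b in zip(s, t))


def tuple_min(s, t):
    """Componentwise minimum of two tuples of equal length."""
    return tuple(min(a, b) for a, b in zip(s, t))


def preserves_max_and_min(d):
    """Check en(max(a, b)) = max(en(a), en(b)), and likewise for min."""
    return all(
        en(max(a, b), d) == tuple_max(en(a, d), en(b, d))
        and en(min(a, b), d) == tuple_min(en(a, d), en(b, d))
        for a in range(d)
        for b in range(d)
    )
```

Because the encoded tuples form a chain under the componentwise order, the componentwise maximum and minimum of two encodings are again encodings, which is why the check succeeds for every domain size.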

$^4$ Optimal (in the number of extra variables) gadgets for cost functions from $\Gamma_{\mathrm{fans},4}$ were shown in [53].